Abstract

The complexity of constraints is a major obstacle for constraint-based software verification. Au- 
tomatic constraint solvers are fundamentally incomplete: input constraints often build on some unde- 
cidable theory or some theory the solver does not support. This paper proposes and evaluates several 
randomized solvers to address this issue. We compare the effectiveness of a symbolic solver (CVC3), 
a random solver, three hybrid solvers (i.e., mix of random and symbolic), and two heuristic search 
solvers. We evaluate the solvers on two benchmarks: one consisting of manually generated con- 
straints and another generated with a concolic execution of 8 subjects. In addition to fully decidable 
constraints, the benchmarks include constraints with non-linear integer arithmetic, integer modulo 
and division, bitwise arithmetic, and floating-point arithmetic. As expected, symbolic solving (in particular,
CVC3) subsumes the other solvers for the concolic execution of subjects that only generate
decidable constraints. For the remaining subjects, the solvers are complementary.


1 Introduction 

Software testing is important and expensive [8, 28, 35]. Several techniques have been proposed to reduce 
this cost. Automation of test data generation, in particular, can improve testing productivity. Random 
testing [13, 30] and symbolic testing [25] are two widely used techniques with this goal and with well-known
limitations. On the one hand, random testing fails to explore a search space in a systematic
manner: it can explore the same program path repeatedly and also fail to explore important paths (i.e.,
paths that only a small portion of the input space can reach). On the other
hand, pure symbolic testing is problematic for indexing arrays, dealing with native calls, detecting infinite
loops and recursion, and, especially, dealing with undecidable constraints. Combined random-symbolic
testing [22] has recently been proposed to circumvent these limitations. One important limitation it
attempts to address is the inability to solve general constraints. This is the focus of this paper. We
study the impact of alternative randomization strategies for solving constraints. In this setting, random- 
symbolic testing reduces to random-symbolic constraint solving. 

One possible way to combine random and symbolic solvers is to first delegate to the random solver
the parts of a constraint that build on theories a symbolic solver does not support, then use the solution
to simplify the original constraint, and finally combine the random solution with the one obtained from
calling the symbolic solver on the simplified constraint. (For simplicity, we assume the constraint is
satisfiable and that the random solver can find a solution.) It is important to note that, as for typical decision
procedures in SMT solvers [20, 37], random and symbolic solvers are not independent in this combination;
they collaborate. One practical consequence is that the more constraints the symbolic solver
rejects, the more complex random solving becomes, and conversely. Therefore, random solving is critical
for the effectiveness of the combined solver.

We define recall as the fraction of satisfiable constraints for which a solver can find a solution.
(We approximate satisfiability: we classify a constraint as satisfiable if at least one solver can
find a solution to it.) This metric quantifies completeness. Our goal is to increase recall. This paper makes
the following contributions: 
• The proposal and implementation of several randomized constraint solvers. We implemented a
plain random solver, three hybrid constraint solvers, and two search-based solvers. We use the 
random solver and the symbolic solver (in our case, CVC3 [1]) as baselines for comparison; 

• Empirical evaluation of solvers with manually constructed constraints and constraints generated 
with a concolic execution of 8 subjects. 


2 Technique: Randomized Solvers 

This section presents randomized solvers with a common input-output interface. Input. All solvers take
as input (i) a constraint pc (in reference to a path condition from a symbolic execution), (ii) a random
seed s, and (iii) a range of values [lo,hi]. An input constraint takes the form $\bigwedge_i b_i$, where each $b_i$ is a boolean
expression constructed, in principle, with any logical system. For example, the expression
$x > 0 \land x > y + 1$ illustrates a valid input constraint. We often use the term constraint alone or clause in reference
to a single boolean expression b, and constraint system or pc in reference to the conjunction of all
constraints. Output. A solution is a vector of variable assignments that satisfies one input constraint.
For instance, $\langle x \mapsto 2, y \mapsto 0 \rangle$ is a solution to the constraint $x > y + 1$ (using integer variables). A solver
returns a solution when it finds one or the flag empty otherwise.

Note on implementation. We wrote all solvers in the Java language, used the BCEL library [14] to
instrument the bytecode of the experimental subject for concolic execution, and used part of the code 
from the JPF symbolic execution [5] for the integration with CVC3. 

2.1 Baseline solvers 

We use the solvers ranSOL and symSOL as representatives of plain random and symbolic solvers,
respectively. In our experiments we use these solvers as baselines for comparison. Figure 1 shows the
pseudo-code for the random constraint solver ranSOL. The main loop generates random input vectors and
selects those that satisfy pc (lines 1-6). The expression vars(pc) denotes the set of variables that occur
in pc. Function random selects random integer values in the range [lo,hi] and builds assignments to
each variable in this set (line 2). (For simplicity, we only show the case for integers.) The function
eval(pc, iv) checks whether the candidate solution iv models pc. This function evaluates the concrete
boolean expression that pc encodes using the variable assignments in iv. ranSOL returns iv at line 4
if it is a solution to pc, or returns empty on timeout. Symbolic constraint solvers are complete for a set
of decidable theories. For example, CVC3 [1] supports rational and integer linear arithmetic (among
others). However, these solvers are incomplete for constraints with non-linear arithmetic and integer
division and modulo, whose theories are undecidable. We use the label symSOL to refer to a symbolic
solver. We used CVC3 in our experiments.
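
The loop of ranSOL can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Java implementation used in the experiments: the constraint is represented as a Python predicate over a dictionary of assignments, the variable names are passed explicitly (a lambda cannot expose vars(pc)), and the names ran_sol and timeout_s are hypothetical.

```python
import random
import time


def ran_sol(pc, variables, seed, lo, hi, timeout_s=1.0):
    """Random solver sketch: return an assignment that satisfies pc,
    or None (standing in for the flag 'empty') when the time budget expires."""
    rng = random.Random(seed)            # seeded generator, so runs are repeatable
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:   # "while not timeout"
        # Build one candidate input vector with integers drawn from [lo, hi].
        iv = {name: rng.randint(lo, hi) for name in variables}
        if pc(iv):                       # eval(pc, iv)
            return iv
    return None


# The example constraint from the text: x > 0 and x > y + 1.
pc = lambda iv: iv["x"] > 0 and iv["x"] > iv["y"] + 1
solution = ran_sol(pc, ["x", "y"], seed=42, lo=-100, hi=100)
```

Because the example constraint is satisfied by a large share of the range [-100,100], the loop normally returns after a handful of candidates; constraints with a sparse solution space are where the timeout matters.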

2.2 Heuristic search solvers 

This section discusses two solvers based on well-known heuristic search techniques: genetic algorithms 
(GA) [23] and particle swarm optimization (PSO) [24]. Conceptually, these solvers attempt to optimize 
the random search that ranSOL drives. The basic task of these algorithms is to search a space of candidate
solutions to identify the best ones in terms of a problem-specific fitness function. The search process
usually starts with the selection of randomly-chosen individuals (i.e., candidate solutions to the search 
problem) in the search space. The search proceeds by making movements on each individual iteratively 
with search operators until the search meets some stop criteria (e.g., the result is good enough or the 
search time expired). The decision to move individuals in the search space depends on the evaluation of 

Input: path condition pc
Input: random seed s, range [lo,hi]

1: while ¬timeout do
2:   iv ⇐ random(vars(pc), range)
3:   if eval(pc, iv) then
4:     return iv
5:   end if
6: end while
7: return empty

Figure 1: Random (ranSOL)

Input: path condition pc
Input: random seed s, and range [lo,hi]

1: (pcgood, pcbad) ⇐ partition(pc)
2: iv1 ⇐ symSOL.solve(pcgood)
3: if (iv1 = empty) then
4:   return empty
5: end if
6: newpc ⇐ pcbad\iv1
7: iv2 ⇐ ranSOL.solve(newpc, seed, range)
8: return iv2 = empty ? empty : iv1 + iv2

Figure 2: Good constraints first (GCF)

Input: path condition pc
Input: random seed s, and range [lo,hi]

1: (pcgood, pcbad) ⇐ partition(pc)
2: sols ⇐ eRanSOL.solve(pcbad, seed, range)
3: for all iv1 in sols do
4:   newpc ⇐ pcgood\iv1
5:   iv2 ⇐ symSOL.solve(newpc)
6:   if iv2 ≠ empty then
7:     return iv1 + iv2
8:   end if
9: end for
10: return empty

Figure 3: Bad constraints first (BCF)

Input: path condition pc
Input: random seed s, and range [lo,hi]

1: (goodvars, badvars) ⇐ partition(pc)
2: while ¬timeout do
3:   iv1 ⇐ random(badvars)
4:   newpc ⇐ pc\iv1
5:   iv2 ⇐ symSOL.solve(newpc)
6:   if iv2 ≠ empty then
7:     return iv1 + iv2
8:   end if
9: end while

Figure 4: Bad variables first (BVF)

their current fitness values. The principle of these algorithms is that the movements across successive
iterations will bring the individuals closer to the solution space, i.e., each iteration potentially explores
better regions in the search space. We discuss next two aspects common to GA and PSO that are central to
our domain of application: (i) the representation of a solution (individual) and (ii) the fitness function.

Representation of a (candidate) solution. One solution to a constraint solving problem is a mapping
of each variable in the constraint system to a concrete value from its domain. For instance, $\langle x \mapsto 2, y \mapsto 0 \rangle$
is a solution to $x > y + 1$ (using integer variables).

Fitness function. The fitness function serves to
evaluate the quality of candidate solutions. Two functions have been widely used for constraint solving
problems: MaxSAT [17, 27, 33] and Stepwise Adaptation of Weights (SAW) [6, 16]. MaxSAT is a
simple heuristic that counts the number of clauses that a solution satisfies. Maximum fitness is
obtained when the solution satisfies all clauses (boolean expressions) in a constraint system (conjunction
of clauses). The main issue with MaxSAT is that the solver can sometimes favor solutions that satisfy
several easy-to-solve constraints at the expense of solutions that satisfy only a few hard-to-solve ones. Back
et al. proposed SAW [6] to reduce the impact of this issue. SAW associates a weight to each clause in
a constraint. The weight of a clause is updated at each iteration in which the clause is not satisfied. The use of SAW helps
to identify harder-to-solve clauses as the number of iterations increases. The solver can use this information to
favor individuals (i.e., to reduce movements on those individuals) that are more fit to satisfy
harder-to-solve clauses. We used SAW to evaluate fitness in our GA and PSO implementations.
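
The difference between the two fitness functions can be made concrete with a small Python sketch. It is a simplified reading of the description above, not the code used in the experiments: clauses are plain predicates, the weight of a violated clause grows by a fixed step (the step size delta is an assumption), and the function names are hypothetical.

```python
def maxsat_fitness(clauses, iv):
    """MaxSAT: number of clauses that the candidate iv satisfies (higher is better)."""
    return sum(1 for clause in clauses if clause(iv))


def saw_penalty(clauses, weights, iv):
    """SAW-style score: sum of the weights of violated clauses (lower is better)."""
    return sum(w for clause, w in zip(clauses, weights) if not clause(iv))


def saw_update(clauses, weights, best_iv, delta=1):
    """Raise the weight of every clause that the current best candidate violates.
    The increment 'delta' is an illustrative choice."""
    return [w + delta if not clause(best_iv) else w
            for clause, w in zip(clauses, weights)]


# Two easy clauses and one harder, non-linear clause.
clauses = [
    lambda iv: iv["x"] > 0,
    lambda iv: iv["y"] < 50,
    lambda iv: iv["x"] * iv["x"] == iv["y"],
]
weights = [1, 1, 1]
candidate = {"x": 3, "y": 10}            # satisfies the first two clauses only

score = maxsat_fitness(clauses, candidate)
penalty_before = saw_penalty(clauses, weights, candidate)
weights = saw_update(clauses, weights, candidate)
penalty_after = saw_penalty(clauses, weights, candidate)
```

After the update, only the third clause carries a larger weight, so a candidate that keeps violating it becomes progressively less fit than one that violates an easy clause instead.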

Summary of GA and PSO. A GA search starts with a population of individuals randomly selected 
from the search space. Each iteration produces a new population with special operators: a crossover 
combines two individuals to produce others and a mutation changes one individual. The individuals are 
probabilistically selected considering their fitness values. Similar to GA, PSO operates with an initial
random population of candidate solutions called particles. The interactive collaboration of particles to 
compute a solution is the main difference between GA and PSO. Each particle has a position in the search 
space and a contribution factor to the population, typically called velocity, which PSO uses to update the 
next position of each particle. A typical PSO iteration updates the velocity of a particle according to 
global best and local best solutions. The next position of a particle depends on the old position and 
the new computed velocity. The mutually-recursive equations below govern the update of velocity and 
position across successive iterations t. 


$$v_{t+1} = \omega \cdot v_t + r_1 \cdot c_1 \cdot (best_{part} - x_t) + r_2 \cdot c_2 \cdot (best_{pop} - x_t)$$

$$x_{t+1} = x_t + v_{t+1}$$

Figure 5: Update of velocity and position in Particle Swarm Optimization (PSO). 

The vectors v and x store, respectively, velocities and positions for each particle. We use the label t
to refer to one iteration. This label is not the index of the vectors. The coefficient $\omega$, typically called
inertia, denotes the fraction of velocity in iteration (instant) t that the particle will inherit in iteration
t+1. Coefficients $r_1$ and $r_2$ are numbers within the range [0,1] randomly generated according to some
informed distribution. The vector $best_{part}$ stores the best solution each particle visited and $c_1$ indicates
the confidence level in local solutions (i.e., in one individual particle). The term $best_{pop}$ indicates the
best solution in the population and $c_2$ indicates the confidence level in global solutions. Note that the
position of a particle at instant t+1 is computed by simply adding the contribution (velocity) $v_{t+1}$.
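
For a single particle, the two update equations translate almost directly into code. The Python sketch below is illustrative only: the default values for the inertia and the confidence coefficients are placeholders rather than the settings used in the experiments, one pair of random coefficients is drawn per update (drawing one per dimension is an equally common choice), and rounding positions back to integers is left out.

```python
import random


def pso_step(x, v, best_part, best_pop, rng, omega=0.7, c1=1.5, c2=1.5):
    """One PSO update for one particle.
    x, v      : current position and velocity (lists of floats)
    best_part : best position this particle has visited
    best_pop  : best position found by the whole population
    Returns the pair (new position, new velocity)."""
    r1, r2 = rng.random(), rng.random()   # random coefficients in [0, 1)
    new_v = [
        omega * vi + r1 * c1 * (bp - xi) + r2 * c2 * (bg - xi)
        for xi, vi, bp, bg in zip(x, v, best_part, best_pop)
    ]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]   # x_{t+1} = x_t + v_{t+1}
    return new_x, new_v


rng = random.Random(0)
position, velocity = pso_step(
    x=[10.0, -4.0], v=[1.0, 0.0],
    best_part=[8.0, -1.0], best_pop=[2.0, 3.0], rng=rng,
)
```

When a particle already sits on both best positions, the two attraction terms vanish and only the inertia term moves it, which is a convenient sanity check for an implementation.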

2.3 Hybrid solvers 

This section describes solvers that conceptually combine ranSOL and symSOL. These hybrid solvers 
make different decisions in (i) what to randomize and in (ii) which order. Note on terminology. We 
use the term eRanSOL in reference to an extension of ranSOL that can return many solutions. We use 
the term pc\iv to denote a substitution of variables in pc with their concrete values in iv. For example,
$(x > 0 \land x > y + 1) \backslash \langle x \mapsto 2 \rangle$ is equivalent to $(2 > 0 \land 2 > y + 1)$.

Good constraints first (GCF). Figure 2 shows the pseudo-code for the GCF solver. At line 1, the solver
partitions the constraint pc in two: the first part, named pcgood, contains decidable constraints. The second,
pcbad, complements the first with undecidable constraints. Recall that pc consists of a conjunction of
boolean expressions. The algorithm reduces to plain random solving if pcgood is empty and to plain
symbolic solving if pcbad is empty. (We omit these checks for simplicity.) When both parts are
non-empty, the combined solver uses the symbolic solver to first find a solution to pcgood (line 2). As pcgood
only contains decidable constraints, an empty answer from symSOL indicates that pcgood is
unsatisfiable (lines 3-5). Consequently, pc is also unsatisfiable since $\neg pcgood$ implies $\neg pc$ (from the partition
function). In case symSOL finds a solution, the solver produces the constraint newpc with the
substitution pcbad\iv1. If the random solver can find one solution to newpc, GCF returns iv1 + iv2 as solution,
i.e., the variable assignments that the symbolic and random solvers produced, respectively. For illustration,
GCF partitions the constraint $b \% a \neq 0 \land a > 0$ in two: pcgood = $a > 0$ and pcbad = $b \% a \neq 0$. (The
modulo operator makes the constraint undecidable.) GCF passes pcgood to the symbolic solver, and uses
the solution, say $\langle a \mapsto 2 \rangle$, to simplify pcbad and finally call the random solver on $b \% 2 \neq 0$.
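
The GCF steps can be sketched in Python under strong simplifying assumptions. The partition is given by hand, each clause is a pair (variables, predicate), and, because no real symbolic solver is called here, the function sym_sol_stand_in merely enumerates the range from the top; it is a placeholder for symSOL (CVC3 in the paper), not a model of how CVC3 works. All function names are hypothetical.

```python
import itertools
import random


def free_vars(clauses):
    return sorted({name for names, _ in clauses for name in names})


def holds(clauses, iv):
    return all(pred(iv) for _, pred in clauses)


def substitute(clauses, iv):
    """pc\\iv: bind the variables assigned in iv and keep the rest free."""
    return [(tuple(n for n in names if n not in iv),
             lambda rest, p=pred: p({**iv, **rest}))
            for names, pred in clauses]


def sym_sol_stand_in(clauses, lo, hi):
    """Placeholder for symSOL: exhaustive search, largest values first."""
    names = free_vars(clauses)
    for values in itertools.product(range(hi, lo - 1, -1), repeat=len(names)):
        iv = dict(zip(names, values))
        if holds(clauses, iv):
            return iv
    return None


def ran_sol(clauses, rng, lo, hi, tries=10000):
    """Random solver with a bounded number of tries instead of a timeout."""
    names = free_vars(clauses)
    for _ in range(tries):
        iv = {name: rng.randint(lo, hi) for name in names}
        if holds(clauses, iv):
            return iv
    return None


def gcf(pcgood, pcbad, seed, lo, hi):
    iv1 = sym_sol_stand_in(pcgood, lo, hi)          # line 2 of Figure 2
    if iv1 is None:
        return None                                 # pcgood unsatisfiable
    iv2 = ran_sol(substitute(pcbad, iv1), random.Random(seed), lo, hi)
    return None if iv2 is None else {**iv1, **iv2}  # line 8


pcgood = [(("a",), lambda iv: iv["a"] > 0)]
pcbad = [(("a", "b"), lambda iv: iv["b"] % iv["a"] != 0)]
solution = gcf(pcgood, pcbad, seed=1, lo=-100, hi=100)
```

The order in which the stand-in enumerates values matters for this example: if the first stage returned a = 1, the residual constraint b % 1 ≠ 0 would have no solution and GCF would report empty even though pc is satisfiable.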

Bad constraints first (BCF). Figure 3 shows the pseudo-code for the BCF solver. It differs from GCF
in the order of randomization: it attempts to solve the undecidable parts first. BCF uses eRanSOL to
find many solutions to pcbad. The main loop checks for each solution iv1 whether symSOL can find a
solution to pcgood\iv1 (lines 3-9). Note that, unlike GCF, BCF calls symSOL once in each
iteration. This algorithm corresponds to the one we discussed in Section 1.

Bad variables first (BVF). Figure 4 shows the pseudo-code for the BVF solver. It is similar to BCF in
the order of calls to random and symbolic solvers. However, it partitions the problem differently. While
the previous hybrid solvers partition the set of clauses from one input constraint, BVF partitions the set
of variables that occur in that constraint. For example, BVF randomizes only the variable b to solve the
constraint $a = b^2 + c$, while BCF and GCF randomize all variables in this case as they appear in a clause
involving non-linear arithmetic. BVF is similar to the solver proposed in DART [22] as it randomizes a
selection of variables to make the constraint decidable. DART, however, randomizes variables incrementally
from left to right in the order they appear in the constraint. The constraint $a = b^2 \land \ldots \land b = a^2$
illustrates one difference between BVF and DART. BVF randomizes variables b and a while DART can
avoid the randomization of a as its value depends only on b's value. We did not evaluate DART itself in
this paper.
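
A minimal Python sketch of the BVF loop on the constraint a = b² + c is shown below. The split into good and bad variables is supplied by hand, the loop is bounded by an iteration count rather than a timeout, and the exhaustive sym_sol_stand_in again only plays the role of symSOL on the residual linear constraint. These are assumptions made for illustration, and the function names are hypothetical.

```python
import itertools
import random


def sym_sol_stand_in(pred, names, lo, hi):
    """Placeholder for symSOL: exhaustive search over the remaining variables."""
    for values in itertools.product(range(lo, hi + 1), repeat=len(names)):
        iv = dict(zip(names, values))
        if pred(iv):
            return iv
    return None


def bvf(pc, good_vars, bad_vars, seed, lo, hi, max_iters=200):
    rng = random.Random(seed)
    for _ in range(max_iters):                       # "while not timeout"
        # Line 3: randomize only the variables that make pc undecidable.
        iv1 = {name: rng.randint(lo, hi) for name in bad_vars}
        # Line 4: newpc = pc\iv1 is linear once b is fixed.
        newpc = lambda iv2, iv1=iv1: pc({**iv1, **iv2})
        iv2 = sym_sol_stand_in(newpc, good_vars, lo, hi)
        if iv2 is not None:
            return {**iv1, **iv2}                    # line 7
    return None


pc = lambda iv: iv["a"] == iv["b"] ** 2 + iv["c"]
solution = bvf(pc, good_vars=["a", "c"], bad_vars=["b"], seed=7, lo=-20, hi=20)
```

Only values of b with b² ≤ 40 leave a residual constraint that has a solution in [-20,20], so several iterations may be discarded before one succeeds; this is the retry behaviour that the loop in Figure 4 captures.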

3 Evaluation 

We evaluate the proposed solvers with two sets of experiments. The first compares the solvers we pro- 
posed and also the symbolic solver CVC3 [1] using a set of constraints written independently by the 
authors. The second compares the solvers using constraints generated from the concolic execution [36] 
of data-structures from a variety of sources. 

3.1 Microbenchmark 

The microbenchmark consists of 51 satisfiable constraints. We included 15 constraints with only linear
integer arithmetic, 7 using the absolute value operator (not supported natively by CVC3), 5 using modulo
and division (undecidable), 22 using non-linear integer arithmetic (undecidable), and 2 using
floating-point arithmetic. Except for CVC3, we ran each solver 10 times with different random seeds, using the
range of values [-100,100], and a timeout of 1 second. We selected these input parameters arbitrarily.
The experiments show that, except for the symbolic solver CVC3, the average recall of each solver was
roughly the same: the minimum average recall is 0.85 for BVF and the maximum average recall is 0.92 for PSO.
As expected, CVC3 could not solve most of the constraints in this microbenchmark. It solved 21 out of
the 51 constraints. Note, however, that it could solve some special cases of undecidable constraints (only 15
constraints in the microbenchmark are decidable). For each constraint except two (one involving non-linear
integer arithmetic and the other floating-point arithmetic), at least one solver could solve it.
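
The recall metric from Section 1, together with its approximate notion of satisfiability, can be computed as in the following Python sketch. The solver labels and constraint identifiers are toy values chosen for illustration; they are not results from the experiments.

```python
def satisfiable_set(solved_by):
    """Approximation used in the paper: a constraint counts as satisfiable
    if at least one solver found a solution to it."""
    return set().union(*solved_by.values())


def recall(solved, satisfiable):
    """Fraction of the satisfiable constraints that one solver solved."""
    return len(solved & satisfiable) / len(satisfiable)


# Toy data: which constraint ids each (hypothetical) solver solved.
solved_by = {
    "solverA": {1, 2, 3},
    "solverB": {1, 4},
    "solverC": {2, 3, 4},
}
satisfiable = satisfiable_set(solved_by)
recalls = {name: recall(ids, satisfiable) for name, ids in solved_by.items()}
```

With several seeds per solver, the average recall reported above would be the mean of this value over the runs.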

3.2 Concolic execution 

Subjects and Setup. We used data structures from a variety of sources. bst (P1) is an implementation
of a binary search tree from Korat [11]. filesystem (P2) is a simplification of the Daisy file system [32].
treemap (P3) is a jdk1.4 implementation (java.util.TreeMap) of red-black trees. switch (P4) refers
to one example program from the jCUTE distribution [36]. ratpoly (P5) is an implementation of
rational polynomial operations from the Randoop distribution [30]. rationalscalar (P6) is another
implementation of rational polynomials from the ojAlgo library [3]. newton (P7) is an implementation
of Newton's method to iteratively compute the square root of a number [2]. hashmap (P8) is a
jdk1.4 implementation (java.util.HashMap) of a map that uses hash values as keys. This experiment
uses a concolic (concrete and symbolic) execution [36] to generate constraints for the subject
programs described above. A concolic execution interprets the program simultaneously in a concrete
and symbolic domain. On the one hand, the use of a concrete state enables a concolic execution to
evaluate deterministically any program expression. This provides a means to handle infinite loops

Input: parameterized test ptest
Input: random seed s, range [lo,hi]

1: iv ⇐ random(vars(ptest), range)
2: result ⇐ { iv }
3: pcs ⇐ pcs + run(ptest, iv)
4: while size(pcs) > 0 do
5:   iv ⇐ solve(pickOne(pcs), s, range)
6:   if iv ≠ empty then
7:     result ⇐ result ∪ { iv }
8:     pcs ⇐ pcs + run(ptest, iv)
9:   end if
10: end while
11: return result

Figure 6: Concolic Execution Driver
Figure 7: Each cell shows the recall for a pair of subject
(row) and solver (column). S1 and S4 correspond
to our baseline solvers.


and recursion, exploration of infeasible paths, and array indexing, which are typical limitations of a
pure symbolic execution. On the other hand, the use of a symbolic state (which the concrete state is
an instance of) enables a concolic execution to collect constraints that lead to non-visited paths along
the execution of one concrete path. Figure 6 shows the pseudo-code of a test driver for a concolic
execution. The driver takes as input (in addition to random seed and range of values) any procedure
with parameters, ptest, and outputs inputs to ptest (that will lead execution to its different program
paths). One iteration of the main loop explores one concrete path and produces several path constraints
(corresponding to non-visited paths along that concrete path). A solution to a constraint, when found,
will drive the next concolic execution of ptest (line 8). The operation solve at line 5 calls each solver
with a timeout of 300 milliseconds (based on the average time from the microbenchmark). We set the
overall timeout to 30 minutes. The concolic execution of the first four subjects generates only linear
integer constraints, while the others construct non-linear constraints and constraints that CVC3 does not support.
Discussion. We use the following identifiers to label solvers: S1=ranSOL, S2=GA, S3=PSO, S4=CVC3,
S5=GCF, S6=BCF and S7=BVF. Figure 7 shows a summary of the results. For the first 3 subjects the
symbolic solver (S4) and consequently all hybrid solvers showed roughly the same average recall. Note
that all constraints passed to the solver in this case are decidable. For switch (P4), which also builds
decidable constraints, S4 often times out. For the last 4 subjects, S4 can rarely find a solution. In these
cases, the search-based algorithms performed better on average. However, we observed that often one
solver finds a solution that another misses. Figure 8 makes a pairwise comparison of the solvers.
Rows and columns denote identifiers of solvers. A cell on row i and column j indicates that solver i solves
a constraint that j misses. Note that, for the 4 experiments at the bottom and switch, the solvers vary
significantly in the set of constraints they can solve. These results confirm our expectations that the
solvers are complementary. This suggests that one may not be able to predict the heuristic that will fit best
for a particular subject; it is preferable to run them all in parallel.

Impact of timeout on recall. Efficiency is important to enable symbolic testing: the number of queries
submitted to the solver can be very high. One way to deal with this issue is to reduce the allotted time for
constraint solving. However, timeout reduction can reduce recall. To observe the impact of timeout on
recall, we ran each concolic execution experiment using timeouts from 100 to 500ms. We observed that
Figure 8: Results of various solvers for constraints that concolic execution generates. Column and row
show solver identifiers. A cell denotes the difference of constraints that a solver (from row) can solve 
and another (from column) cannot. The bottom line summarizes the results. 
CVC3 typically finds a solution for decidable constraints in less than 100ms. However, for the Switch
experiment the recall of CVC3 was 0.25 for 100ms, 0.75 for 200ms and 0.95 from 300 to 500ms. We
could not determine the reason for this. In particular, we did not find a strong correlation between the
size of the constraints or the number of variables in them and a higher impact of the timeout. We also observed a
significant variation in recall for PSO on Newton and HashMap and for GA on TreeMap and HashMap.
For these cases, we conjecture that the impact of time relates to the complexity of the search problem,
i.e., the relatively small size of the solution space compared to that of the search space.

4 Related Work 

Random-symbolic testing has been widely investigated recently to automate test input generation [22,
26, 21]. It alternates concrete and symbolic execution to alleviate their main limitations. It is important
to note that random-symbolic testing provides two orthogonal contributions: (i) constraint generation 
and (ii) constraint solving. Our goal is to improve constraint solving. In this context, DART [22] concep- 
tually uses a random solver to simplify symbolic solving. We plan to evaluate the solvers we proposed 
with a DART solver as discussed in Section 2. Another approach to automate test input generation is 
random testing [10, 18, 31, 29]. The ranSOL solver differs from random testing in two important ways: 
(a) random testing generates inputs for program parameters; a classification of good input depends on 
the result of an actual execution, and (b) random testing typically generates test sequences and data
simultaneously. We plan to combine random sequence generation with random-symbolic input
generation to automate testing.

We used the Satisfiability Modulo Theories (SMT) [37, 20, 12] solver CVC3 [1], which uses built-in
theories for rationals and integer linear arithmetic (with some support for non-linear arithmetic). Research
on SAT solving for undecidable theories has focused on the analysis of hybrid and control systems, as
recently evidenced by the iSAT [19] and the ABSolver [7] systems. The first integrates the power of SMT
solvers to solve boolean constraints with the capability of Interval Constraint Propagation (ICP) [9] to
deal with non-linear constraint systems, while the second uses a DPLL-based [15] algorithm to perform
the search and defers theory problems to subordinate solvers. As in hybrid and control systems,
undecidable theories also arise in the domain of software systems. This paper shows simple algorithms that can
be effective to solve both decidable and undecidable fragments of constraints that a concolic program
execution generates. Another distinguishing feature of our solvers is that, in contrast to a DPLL(T) [20]
solver, they are not dependent on a background theory T. One can use the solvers this paper describes in
combination with any theory-specific solver to fully benefit from their complementary nature.

There are variations to the search-based solvers presented in Section 2 which we plan to investigate. 
Ru and Jianhua propose a hybrid technique which combines GA and PSO by creating individuals in a 
new generation by crossover and mutation operations [34]. Instinct-based PSO adds another criterion 
(the instinct) to influence a particle’s behavior [4]. The instinct represents the intrinsic “goodness” of 
each variable of a particle’s candidate solution. We also plan to analyze how test inputs generated from 
our solvers compare to those generated directly with a PSO algorithm whose fitness function is based on 
coverage [38]. 

5 Conclusions 

This paper proposes and implements a plain random solver, three hybrid solvers combining random 
and symbolic solvers, and two heuristic search solvers. We use a random solver and a symbolic solver 
(CVC3) as baselines for comparison. We evaluate the solvers on two benchmarks: one with constraints
the authors constructed and the other with constraints that a concolic execution generates on 8 subjects.
For the concolic execution on subjects that generated only decidable constraints, the experiments
reveal, as expected, that CVC3 is superior in all but 2 cases. CVC3 timed out in these cases. For solving
undecidable constraints, no solver subsumes another. This suggests that one may not be able to predict the
heuristic that will fit best for a particular subject; it is preferable to run them all in parallel.

Next we want to analyse several open source projects to quantify how many of the constraints they
produce are undecidable. We believe this is a necessary step to provide evidence for the
practical relevance of this research.